Raise HPACK safety from network input -- prevent SIGSEGV on malformed HEADERS#13
Open
jcmallery wants to merge 34 commits into
Open
Raise HPACK safety from network input -- prevent SIGSEGV on malformed HEADERS#13jcmallery wants to merge 34 commits into
jcmallery wants to merge 34 commits into
Conversation
…nd regions Global buffer pool for octet buffers with three size classes: Small (16B, max 64 free), Medium (1KB, max 32), Large (16KB, max 32). Lock-free via CAS (Treiber stack): sb-ext:cas on SBCL, sys:compare-and-swap on LW, generic lock fallback. allocate-buffer returns fill-pointer arrays with fp = requested size. (length buffer) returns the requested size; array-total-size returns the size-class capacity. deallocate-buffer resets fp before pooling. with-resource-usage-region: scope-based deallocation for buffers that escape local management (e.g., HEADERS continuation closures). region-track-buffer transfers ownership to the region. Measured on SBCL (50 H2 GET + 10 POST, Apple M3): Small: 2 allocated, 2402 recycled (1201:1) Medium: 5 allocated, 1419 recycled (284:1) Large: 1 allocated, 9 recycled (9:1)
- Add to .asd and package.lisp - Move documentation to mgl-pax. - Align file and package name - Use buffer-pool package from core
HTTP-STREAM-ERROR now derives from ERROR. There is a specific subclass to be raised when we receive the stream error from the peer, as opposed from when we detect it.
New generic function QUEUE-FRAME-REGION to write only part of the provided data. Falls back to QUEUE-FRAME. WRITE-VECTOR-FRAME can be now used to send prepared data vector without copying it, providing QUEUE-FRAME-REGION is specialized for the connection to avoid copying. QUEUE-FRAME-REGION uses WRITE-VECTOR-FRAME now.
There were two functions with same interpretation and it did not work.
Relocate the method, scheme, authority, path slots from server-stream up to the http2-stream common superclass. Both server-stream and client-stream instances now inherit the slots (and their accessor generic functions). Rationale: log-closed-stream in core/frames/headers.lisp calls (get-path stream), (get-method stream) etc. on any h2-stream during http-stream-error handling. The slots were only on server-stream, so these calls signalled no-applicable-method-error on client streams, killing client reactors. Moving to http2-stream keeps the existing server-stream API (all slots still accessible via the same accessor GFs) while making the accessors work uniformly on client streams for diagnostics and debugging. Backward-compatible: server-stream instances retain all slots via inheritance, and no existing code paths change.
…aders Add request pseudo-header slots to client-stream Should fix LOG-CLOSED-STREAM on client streams (presently raises error)
Frame parsing, HPACK encoding/decoding, stream management, and connection processing had no optimization declarations. Default optimization runs generic dispatch, type checks, and no inlining on the H2 framing hot path. 13 core files: classes, frames (main + 6 frame types), hpack, stream-based-connections, payload-streams, pipe, utils. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The mode line attribute said Package: http2/resources but the in-package form uses http2/buffer-pool. Genera reads the mode line to set the package, so the mismatch causes compilation in the wrong package.
FIND-JUST-STREAM-BY-ID returns the keyword :CLOSED when the stream-id is absent from the streams table (the stream was already closed and reaped). APPLY-STREAM-PRIORITY then ran the default method, which calls (SETF (GET-WEIGHT stream) WEIGHT) and (SETF (GET-DEPENDS-ON stream) ...) -- neither has a method specialised on a keyword. Symptom in production: a PRIORITY-bearing frame (or a HEADERS frame with the priority flag) for a closed stream made the H2 writer thread die with SIMPLE-ERROR -- No applicable methods for #<STANDARD-GENERIC-FUNCTION (SETF GET-WEIGHT) ...> with args (WEIGHT :CLOSED) RFC 9113 sec 5.5 explicitly permits PRIORITY frames on closed streams, and RFC 9218 deprecates stream priority entirely, so the right behaviour is "do nothing" -- there is no per-stream state to update. The library already follows this pattern for the three other generics that FIND-JUST-STREAM-BY-ID's keyword return value reaches: - update-window-size (eql :closed) - peer-resets-stream (eql :closed) - get-peer-window-size (eql :closed) apply-stream-priority was missing one. Adding it fixes the crash without changing behaviour on live streams.
… HEADERS
The HPACK decoder runs under (safety 0) in two places:
1. Per-file (declaim (optimize (speed 3) (safety 1))) at top of hpack.lisp.
2. decode-huffman-to-stream's inner update-vars flet has
(declare (optimize (safety 0))). Its outer flet block also runs
under (optimize speed (safety 0) space).
That combination means the aref into the input octet buffer inside
update-vars runs with no bounds check. A HEADERS frame whose huffman-
encoded contents end in a way that drives idx past end without hitting
the (= idx end) guard reads past the buffer. In production on a
LispWorks 8.1 server (rmcs1, 2026-05-22, while serving an MP4 with
multiplexed sub-requests on the same H2 connection), this produced:
SYSTEM::EXCEPTION-ERROR -- Segmentation violation (SIGSEGV) at
offset 936 in HTTP2/HPACK:DO-DECODED-HEADERS.
A SIGSEGV from network input is a stability hazard: it kills the H2
writer thread without giving the existing handler-bind in the reactor
a chance to send RST_STREAM or GOAWAY. The recoverable Lisp
condition that should have been signalled instead is hidden by the
safety-0 declamations.
Fix is minimal:
- File-level declaim drops safety 1, leaving it at the implementation
default (typically 3 on LispWorks and SBCL). speed 3 retained.
- update-vars's (safety 0) raised to (safety 1).
- Guard tightened from (= idx end) to (>= idx end) so any future code
path that could increment idx without re-checking still bails out
safely instead of reading past the buffer.
- Outer flet block's (safety 0) raised to (safety 1).
After this change, a malformed HEADERS frame signals a recoverable
condition that the H2 reactor's handler-bind catches and treats as
a stream protocol error -- RST_STREAM on the offender, the rest of
the multiplexed connection continues normally.
Verified on LispWorks 8.1.2 + CL-HTTP 73.0: hpack.lisp recompiles
clean, H2 GET / HEAD / POST against redirect-dispatch fixtures all
return correct status codes (HPACK decode produces correct
(name, value) tuples), and a Safari-mimicking H2 request with eight
custom headers parses without error.
Owner
|
I definitely do not want crashes on malformed network input, so I would check and fix in next release. Regardles, if there is a known network input that leads to the fault, I would prefer to catch it so that I can close the connection with proper error (A receiver MUST terminate the connection with a connection error (Section 5.4.1) of type COMPRESSION_ERROR if it does not decompress a field block.). Do you know what was the trigger? Is the change of |
Author
|
Thanks — three answers:
1. Trigger / captured frame.
We don't have a captured malformed frame. The SIGSEGV killed the LispWorks writer thread before our reactor's handler-bind could
intercept and log the input. The host was rmcs1 (a production LispWorks 8.1 server) on 2026-05-22, serving an MP4 with several
multiplexed sub-requests on the same H2 connection — real-world internet traffic, likely a bot probe, a browser-quirk request, or a
coalesced/corrupted CONTINUATION block. The stack frame was HTTP2/HPACK:DO-DECODED-HEADERS + 936.
Now that safety is raised, a malformed HEADERS frame should propagate as a normal Lisp condition that we can log + RST_STREAM. We'll be
watching for recurrence and will share the offending bytes if/when we capture one.
2. Is = → >= sufficient?
I don't think so. By inspection, (= idx end) should suffice in normal operation — update-vars only ever increments idx by 1 inside the
guarded block, so under correct execution idx cannot skip past end. The fact that the SIGSEGV happened anyway is the smoking gun that
the bounds check itself wasn't being honored.
The load-bearing fix is raising safety. With (safety 0) declared on both update-vars's inner flet and the outer flet block, the
implementation is licensed to assume the programmer's invariants and may discard the conditional return-from, reorder the aref past it,
or otherwise turn the bounds-violating aref into raw pointer arithmetic. That's how a check that looks correct on the page produces a
segfault on the wire.
= is belt-and-suspenders — it covers the case of caller passing start > end (e.g. a CONTINUATION header block fragment with a malformed
length), which = would miss. But the safety raise is what guarantees the bounds check actually runs.
I'd keep both: >= for cheap defensive coverage, and the safety raise for the real fix.
3. RFC §5.4.1 COMPRESSION_ERROR.
Agreed — that's exactly the contract the safety raise restores. Pre-fix, a malformed field block produced a SIGSEGV that bypassed all
condition handling. Post-fix, the same input signals a normal aref bounds error that propagates up to the H2 reactor's handler-bind,
which can now classify it as a stream protocol error (or, if you prefer, a connection-level error and emit GOAWAY with
COMPRESSION_ERROR). Happy to add a specific compression-error condition class in the HPACK decoder if you want — it would let the
reactor distinguish "this stream's headers were malformed" from generic decode failures and produce the RFC-mandated GOAWAY.
— John
… On May 24, 2026, at 11:29, Tomas Zellerin ***@***.***> wrote:
zellerin
left a comment
(zellerin/http2#13)
<#13 (comment)>
I definitely do not want crashes on malformed network input, so I would check and fix in next release.
Regardles, if there is a known network input that leads to the fault, I would prefer to catch it so that I can close the connection with proper error (A receiver MUST terminate the connection with a connection error (Section 5.4.1 <https://httpwg.org/specs/rfc9113.html#ConnectionErrorHandler>) of type COMPRESSION_ERROR <https://httpwg.org/specs/rfc9113.html#COMPRESSION_ERROR> if it does not decompress a field block.). Do you know what was the trigger? Is the change of = to >= sufficient?
—
Reply to this email directly, view it on GitHub <#13 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AO4AKMX73CDLILUELPSRDT344MBMNAVCNFSM6AAAAACZJCOIXCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHM2DKMRZGAYTGOBQGI>.
Triage notifications on the go with GitHub Mobile for iOS <https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675> or Android <https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
You are receiving this because you authored the thread.
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
The HPACK decoder in
core/hpack.lispruns under reduced safety in twoplaces:
(declaim (optimize (speed 3) (safety 1)))at the topof the file lowers safety to 1 across all of hpack.lisp, including
the cross-function dispatch in
do-decoded-headersandread-http-header.decode-huffman-to-stream: the innerupdate-varsflet has(declare (optimize (safety 0))), and its outer flet block runsunder
(declare (optimize speed (safety 0) space)).The
aref bytes idxinsideupdate-varstherefore has no boundscheck. A HEADERS frame whose huffman-encoded contents end in a way
that drives
idxpastendwithout hitting the(= idx end)guardreads past the buffer.
Hit in production on a LispWorks 8.1 server (2026-05-22), while serving
an MP4 video with multiplexed sub-requests on the same H2 connection.
The crash:
The fault registers showed
x0 = x1 = <Lisp-tagged value>beingdereferenced as an array pointer, with neighbouring registers holding
character-tagged values where octets should be -- consistent with the
decoder reading out-of-band data interpreted under the wrong type
assumption.
A
SIGSEGVfrom network input is a stability hazard. It kills theH2 writer thread before the existing reactor handler-bind can send
RST_STREAMon the offending stream. The recoverable Lisp conditionthat should have been signalled is hidden by the safety-0
declamations.
Fix
Minimal, four lines of effective change:
declaimdrops(safety 1), leaving safety at theimplementation default (typically 3 on LispWorks and SBCL).
speed 3retained.update-varsinner declaration raised from(safety 0)to(safety 1).(= idx end)to(>= idx end)so any futurecode path that could increment
idxpastendstill bails outsafely instead of reading past the buffer.
(safety 0)to(safety 1).After this change, a malformed HEADERS frame signals a recoverable
condition that the H2 reactor's handler-bind catches and treats as a
stream protocol error --
RST_STREAMon the offender, the rest ofthe multiplexed connection continues.
Performance cost
Negligible. The hot path in
decode-huffman-to-streamis thecase-based dispatch over the huffman prefix tree (huge generatedcond), not the byte fetch. Adding bounds checking on the singlearefper byte adds one comparison per byte, dwarfed by the dispatchcost.
The change to the v2.0.5 branch already removes the file-level
(declaim (optimize (speed 3) (safety 1)))-- this PR brings the samesafety improvement back to master for users who haven't migrated yet.
Verification
Tested on LispWorks 8.1.2 + CL-HTTP 73.0:
hpack.lisprecompiles cleanly with no warnings about reachabilityor type uncertainty
302, 303, 305, 307, 308) on GET, HEAD, and POST. All HPACK decodes
produce correct (name, value) tuples
User-Agent,Accept,Accept-Language,Accept-Encoding,Sec-Fetch-Dest,Sec-Fetch-Mode,Sec-Fetch-Site,Priority) parses withouterror and returns expected response
Not tested:
the change should apply identically, but no separate verification
run was done for this PR)
Note on v2.0.5
The v2.0.5 branch already drops the file-level
(declaim (optimize (speed 3) (safety 1)))per Tomas's cleanups. This PR is for usersstill on master. If you'd prefer to merge only into master and let
v2.0.5 supersede it naturally, that's fine -- the per-function flet
changes in
decode-huffman-to-streamare independent of thefile-level declaim and worth picking up either way.